AI029
Reinforcement Learning: An Introduction
Function Approximation and Policy Gradient Methods
Learning Objectives
- Identify the limitations of tabular methods in high-dimensional state spaces.
- Formulate value function approximation as minimization of the Mean Squared Value Error (VE).
- Derive the Policy Gradient Theorem and apply it in the REINFORCE algorithm.
- Analyze how Actor-Critic architectures reduce the variance of policy gradient updates.